High-Dimensional Bayesian Geostatistics

Author: Banerjee Sudipto
Publication date: 01/01/2017

With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations whose complexity grows cubically in the number of spatial locations and temporal points. This renders such models infeasible for large data sets. This article offers a focused review of two methods for constructing well-defined, highly scalable spatiotemporal stochastic processes. Both processes can be used as "priors" for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity is $\sim n$ floating point operations (flops) per iteration, where $n$ is the number of spatial locations. We compare these methods and provide some insight into their methodological underpinnings.
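The sparse-precision idea behind the NNGP can be illustrated with a minimal sketch. It is not the article's implementation: the one-dimensional locations, the exponential covariance, and the names `exp_cov`, `nngp_factors`, and `nngp_precision` are assumptions made for illustration. Each location is regressed on at most `m` previously ordered neighbors, so each row costs a fixed amount of work and the total cost grows linearly in $n$.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance between two sets of 1-D locations (assumed kernel)."""
    return sigma2 * np.exp(-phi * np.abs(a[:, None] - b[None, :]))

def nngp_factors(locs, m=3, sigma2=1.0, phi=3.0):
    """Return (A, d) such that w_i = sum_j A[i, j] * w_j + e_i with Var(e_i) = d[i].

    A is strictly lower triangular with at most m nonzeros per row.
    It is stored densely here for readability; a real implementation
    would use a sparse matrix format.
    """
    n = len(locs)
    A = np.zeros((n, n))
    d = np.empty(n)
    d[0] = sigma2  # the first location has no earlier neighbors
    for i in range(1, n):
        prev = np.arange(i)
        # the m nearest locations among those that come earlier in the ordering
        nbrs = prev[np.argsort(np.abs(locs[prev] - locs[i]))[:m]]
        C_nn = exp_cov(locs[nbrs], locs[nbrs], sigma2, phi)
        c_ni = exp_cov(locs[nbrs], locs[i:i + 1], sigma2, phi)[:, 0]
        weights = np.linalg.solve(C_nn, c_ni)  # small m x m solve
        A[i, nbrs] = weights
        d[i] = sigma2 - c_ni @ weights  # conditional variance
    return A, d

def nngp_precision(A, d):
    """Precision matrix Q = (I - A)^T D^{-1} (I - A), with D = diag(d)."""
    I_minus_A = np.eye(len(d)) - A
    return I_minus_A.T @ (I_minus_A / d[:, None])
```

For sorted one-dimensional locations, the exponential covariance is Markov, so `m=1` already reproduces the exact inverse covariance; in two or more dimensions the construction is an approximation whose quality depends on `m` and on the ordering.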

arXiv.org e-Print Archive

Ezid

eScholarship - University of California

Spatial Joint Species Distribution Modeling using Dirichlet Processes

Author: Banerjee Sudipto
Gelfand Alan E.
Shirota Shinichiro
Publication date: 21/09/2018

Species distribution models usually attempt to explain presence-absence or abundance of a species at a site in terms of the environmental features (so-called abiotic features) present at the site. Historically, such models have considered species individually. However, it is well established that species interact to influence presence-absence and abundance (envisioned as biotic factors). As a result, there has been substantial recent interest in joint species distribution models with various types of response, e.g., presence-absence, continuous and ordinal data. Such models incorporate dependence between species responses as a surrogate for interaction. The challenge we focus on here is how to address such modeling in the context of a large number of species (e.g., of order $10^2$) across sites numbering in the order of $10^2$ or $10^3$ when, in practice, only a few species are found at any observed site. Again, there is some recent literature to address this; we adopt a dimension reduction approach. The novel wrinkle we add here is spatial dependence. That is, we have a collection of sites over a relatively small spatial region, so it is anticipated that species distribution at a given site would be similar to that at a nearby site. Specifically, we handle dimension reduction through Dirichlet processes joined with spatial dependence through Gaussian processes. We use both simulated data and a plant communities dataset for the Cape Floristic Region (CFR) of South Africa to demonstrate our approach. The latter consists of presence-absence measurements for 639 tree species on 662 locations. Through both data examples we are able to demonstrate improved predictive performance using the foregoing specification.
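How a Dirichlet process can reduce dimension may be easier to see in a small sketch. The code below is a generic truncated stick-breaking construction, not the authors' model: the concentration `alpha`, the truncation level `K`, and the function names are assumptions chosen for illustration. Species are assigned to clusters drawn from the stick-breaking weights, and typically far fewer clusters than species end up occupied.

```python
import numpy as np

def stick_breaking(alpha, K, rng):
    """Truncated stick-breaking weights for a Dirichlet process prior."""
    v = rng.beta(1.0, alpha, size=K)
    v[-1] = 1.0  # truncation: the last stick takes whatever mass remains
    # mass still unbroken before each stick: 1, (1-v_1), (1-v_1)(1-v_2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

def assign_clusters(n_species, alpha, K, rng):
    """Draw a cluster label for each species from the stick-breaking weights."""
    weights = stick_breaking(alpha, K, rng)
    labels = rng.choice(K, size=n_species, p=weights)
    return labels, weights

rng = np.random.default_rng(0)
labels, weights = assign_clusters(n_species=100, alpha=2.0, K=50, rng=rng)
n_occupied = len(np.unique(labels))  # number of distinct clusters in use
```

Species sharing a label would share cluster-level parameters, which is where the reduction in dimension comes from; a smaller `alpha` tends to concentrate mass on fewer clusters.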

arXiv.org e-Print Archive

eScholarship - University of California

Bayesian State Space Modeling of Physical Processes in Industrial Hygiene

Author: Abdalla Nada
Arnold Susan
Banerjee Sudipto
Ramachandran Gurumurthy
Publication date: 05/07/2018

Exposure assessment models are deterministic models derived from physical-chemical laws. In real workplace settings, chemical concentration measurements can be noisy and indirectly measured. In addition, inference on important parameters such as generation and ventilation rates is usually of interest, since these are difficult to obtain. In this paper we outline a flexible Bayesian framework for parameter inference and exposure prediction. In particular, we propose using Bayesian state space models by discretizing the differential equation models and incorporating information from observed measurements and expert prior knowledge. At each time point, a new measurement is available that contains some noise, so using the physical model and the available measurements, we try to obtain a more accurate state estimate, which can be called filtering. We consider Monte Carlo sampling methods for parameter estimation and inference under nonlinear and non-Gaussian assumptions. The performance of the different methods is studied on computer-simulated and controlled laboratory-generated data. We consider some commonly used exposure models representing different physical hypotheses.
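The filtering step described above can be sketched with a bootstrap particle filter. This is an illustration under stated assumptions, not the paper's method: it assumes a well-mixed one-box model $dC/dt = (G - QC)/V$ discretized with an Euler step, additive Gaussian process and measurement noise, and known parameters `G`, `Q`, and `V`. All function names and numerical values are hypothetical.

```python
import numpy as np

def step(c, G, Q, V, dt):
    """One Euler step of the assumed one-box model dC/dt = (G - Q*C) / V."""
    return c + dt * (G - Q * c) / V

def simulate(T, G, Q, V, dt, sd_proc, sd_obs, rng):
    """Simulate latent concentrations and noisy measurements."""
    c, states, obs = 0.0, [], []
    for _ in range(T):
        c = step(c, G, Q, V, dt) + rng.normal(0.0, sd_proc)
        states.append(c)
        obs.append(c + rng.normal(0.0, sd_obs))
    return np.array(states), np.array(obs)

def bootstrap_filter(obs, G, Q, V, dt, sd_proc, sd_obs, n_particles, rng):
    """Return the filtered mean of the concentration at each time point."""
    particles = np.zeros(n_particles)
    means = np.empty(len(obs))
    for t, y in enumerate(obs):
        # propagate every particle through the physical model plus process noise
        particles = step(particles, G, Q, V, dt) + rng.normal(0.0, sd_proc, n_particles)
        # weight by the Gaussian measurement likelihood (log scale for stability)
        logw = -0.5 * ((y - particles) / sd_obs) ** 2
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means[t] = np.sum(w * particles)
        # multinomial resampling to avoid weight degeneracy
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]
    return means

rng = np.random.default_rng(0)
params = dict(G=100.0, Q=10.0, V=100.0, dt=0.1, sd_proc=0.05, sd_obs=0.5)
states, obs = simulate(T=200, rng=rng, **params)
filtered = bootstrap_filter(obs, n_particles=2000, rng=rng, **params)
```

Here the generation and ventilation rates are held fixed, so only the state is estimated; inferring those parameters as well would require augmenting the state or wrapping the filter in an outer sampler.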

arXiv.org e-Print Archive

eScholarship - University of California

Hierarchical spatial models for predicting tree species assemblages across large domains

Author: Banerjee Sudipto
Finley Andrew O.
McRoberts Ronald E.
Publication venue: Institute of Mathematical Statistics
Publication date: 08/10/2009

Spatially explicit data layers of tree species assemblages, referred to as forest types or forest type groups, are a key component in large-scale assessments of forest sustainability, biodiversity, timber biomass, carbon sinks and forest health monitoring. This paper explores the utility of coupling georeferenced national forest inventory (NFI) data with readily available and spatially complete environmental predictor variables through spatially-varying multinomial logistic regression models to predict forest type groups across large forested landscapes. These models exploit underlying spatial associations within the NFI plot array and the spatially-varying impact of predictor variables to improve the accuracy of forest type group predictions. The richness of these models incurs onerous computational burdens, and we discuss dimension-reducing spatial processes that retain the richness in modeling. We illustrate using NFI data from Michigan, USA, where we provide a comprehensive analysis of this large study area and demonstrate improved prediction with associated measures of uncertainty. Comment: Published at http://dx.doi.org/10.1214/09-AOAS250 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
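As a rough illustration of what a spatially-varying multinomial logistic model computes at a single location, the sketch below turns location-specific coefficients into class probabilities. The baseline-category parameterization, the function name `class_probabilities`, and the example values are assumptions for illustration; the paper's own specification may differ.

```python
import numpy as np

def class_probabilities(x, beta):
    """Forest-type-group probabilities at one location.

    x    : (p,) predictor vector at the location
    beta : (K-1, p) coefficients for the non-reference classes at that
           location; in a spatially-varying model these change over space
    """
    # the reference class has its linear predictor fixed at 0
    eta = np.concatenate(([0.0], beta @ x))
    eta = eta - eta.max()  # subtract the maximum for numerical stability
    expo = np.exp(eta)
    return expo / expo.sum()

# hypothetical example: two predictors, three classes
x = np.array([1.0, 0.5])
beta_here = np.array([[0.2, -1.0],
                      [1.5, 0.3]])
probs = class_probabilities(x, beta_here)
```

Predicting a forest type group at a new location would then amount to evaluating these probabilities with the coefficients interpolated, or sampled, at that location.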

arXiv.org e-Print Archive

Crossref